Query Expansion with Naive Bayes for Searching Distributed Collections
نویسندگان
چکیده
The proliferation of online information resources increases the importance of effective and efficient distributed searching. However, the problem of word mismatch seriously hurts the effectiveness of distributed information retrieval. Automatic query expansion has been suggested as a technique for dealing with the fundamental issue of word mismatch. In this paper, we propose a method query expansion with Naive Bayes to address the problem, discuss its implementation in IISS system, and present experimental results demonstrating its effectiveness. Such technique not only enhances the discriminatory power of typical queries for choosing the right collections but also hence significantly improves retrieval results.
منابع مشابه
Eeective Retrieval with Distributed Collections
This paper evaluates the retrieval eeective-ness of distributed information retrieval systems in realistic environments. We nd that when a large number of collections are available, the retrieval eeectiveness is signiicantly worse than that of centralized systems, mainly because typical queries are not adequate for the purpose of choosing the right collections. We propose two techniques to addr...
متن کاملDiagnosis of Pulmonary Tuberculosis Using Artificial Intelligence (Naive Bayes Algorithm)
Background and Aim: Despite the implementation of effective preventive and therapeutic programs, no significant success has been achieved in the reduction of tuberculosis. One of the reasons is the delay in diagnosis. Therefore, the creation of a diagnostic aid system can help to diagnose early Tuberculosis. The purpose of this research was to evaluate the role of the Naive Bayes algorithm as a...
متن کاملQuery Expansion for the Language Modelling Framework Using the Naïve Bayes Assumption
Language modelling is new form of information retrieval that is rapidly becoming the preferred choice over probabilistic and vector space models, due to the intuitiveness of the model formulation and its effectiveness. The language model assumes that all terms are independent, therefore the majority of the documents returned to the ser will be those that contain the query terms. By making this ...
متن کاملA New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier
With the fast increase of the documents, using Text Document Classification (TDC) methods has become a crucial matter. This paper presented a hybrid model of Invasive Weed Optimization (IWO) and Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS) in order to reduce the big size of features space in TDC. TDC includes different actions such as text processing, feature extraction, form...
متن کاملCAP7: Searching and Browsing in Distributed Document Collections
This paper describes CAP7, a system for searching and browsing in distributed document (metadata) collections. The system architecture is similar to Harvest, comprising gatherer components and a retrieval engine; but instead of the limited SOIF data format, we use RDF and XML. The gatherer creates RDF metadata descriptions of collected resources. Before delivering the data to the retrieval engi...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2002